@Chibionos
Contributor

Summary

Enables evaluators to run and produce scores once the agent completes after a resume operation.

Changes

1. Consistent Thread ID for Checkpointing

  • Changed the thread ID from a value that was unique per run to eval_item.id, which stays consistent across suspend/resume
  • This allows LangGraph checkpoints to be found when resuming from suspended state
  • Without this, resume would start with a new thread_id and fail to find the checkpoint
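The effect of a stable thread ID can be sketched with a toy checkpoint store. This is only an illustration under stated assumptions: the in-memory dictionary, the helper names `save_checkpoint` and `load_checkpoint`, the sample ID, and the use of a UUID for the old per-run value are all hypothetical and are not taken from the PR's code.

```python
import uuid
from typing import Optional

# Hypothetical in-memory stand-in for a checkpoint store keyed by thread_id.
_checkpoints: dict = {}

def save_checkpoint(thread_id: str, state: dict) -> None:
    """Persist the suspended state under the given thread_id."""
    _checkpoints[thread_id] = state

def load_checkpoint(thread_id: str) -> Optional[dict]:
    """Return the saved state, or None when no checkpoint matches."""
    return _checkpoints.get(thread_id)

eval_item_id = "eval-item-001"  # illustrative value, not a real ID

# Suspend phase: the checkpoint is stored under the eval item's ID.
save_checkpoint(eval_item_id, {"waiting_on": "external_job"})

# Resume with a freshly generated ID (assumed here to be a UUID): lookup misses.
missed = load_checkpoint(str(uuid.uuid4()))

# Resume with the same eval item ID: the checkpoint is found.
found = load_checkpoint(eval_item_id)
```

Keying the store by the eval item rather than by the run is what lets two separate CLI invocations refer to the same saved state.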

2. Resume Mode Implementation

  • When --resume flag is set, pass Command(resume=data) to continue from interrupt() point
  • Uses mock resume data for testing: {"status": "completed", "result": "mock_completion_data"}
  • In production, orchestrator provides actual result data from external work (RPA process, human input, etc.)
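A minimal sketch of how the graph input might be chosen in each mode. The `Command` dataclass below is a local stand-in for `langgraph.types.Command` so the example runs without LangGraph installed, and `build_graph_input` is a hypothetical helper name, not the function used in this PR.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Command:
    """Local stand-in for langgraph.types.Command (assumption for this sketch)."""
    resume: Any = None

# Mock payload quoted in the PR description; used only for testing.
MOCK_RESUME_DATA = {"status": "completed", "result": "mock_completion_data"}

def build_graph_input(resume: bool, eval_inputs: dict,
                      resume_data: Optional[dict] = None) -> Any:
    """Return the original inputs on a fresh run, or a resume command otherwise."""
    if not resume:
        return eval_inputs
    # Fall back to the mock payload when no real result data is supplied.
    payload = resume_data if resume_data is not None else MOCK_RESUME_DATA
    return Command(resume=payload)

fresh_input = build_graph_input(False, {"query": "hello"})
mock_resume = build_graph_input(True, {"query": "hello"})
real_resume = build_graph_input(True, {"query": "hello"}, {"status": "completed", "result": 42})
```

In the real flow, the orchestrator would supply `resume_data`; the mock default only exists so the resume path can be exercised locally.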

3. Evaluator Execution

  • Previously: Evaluators skipped during suspend (correct) but also couldn't run after resume
  • Now: Agent completes after resume → evaluators run on final output → scores generated
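The gating described above can be sketched as follows. The `Status` enum, the `score_run` helper, and the sample evaluator are hypothetical names introduced for illustration; the actual evaluator interface in the SDK may differ.

```python
from enum import Enum
from typing import Callable, Dict

class Status(Enum):
    """Hypothetical mirror of the runtime statuses relevant here."""
    SUCCESSFUL = "successful"
    SUSPENDED = "suspended"

def score_run(status: Status, output: str,
              evaluators: Dict[str, Callable[[str], float]]) -> Dict[str, float]:
    """Skip scoring while suspended; score the final output once the agent completes."""
    if status is Status.SUSPENDED:
        return {}  # no final output yet, so there is nothing to evaluate
    return {name: evaluate(output) for name, evaluate in evaluators.items()}

# Illustrative evaluator: 1.0 when the output mentions "done", else 0.0.
evaluators = {"mentions_done": lambda text: 1.0 if "done" in text else 0.0}

suspended_scores = score_run(Status.SUSPENDED, "", evaluators)
resumed_scores = score_run(Status.SUCCESSFUL, "job done", evaluators)
```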

Testing

Before (evaluators don't run on resume):

rm -rf __uipath/state.db
uipath eval agent-simple evaluations/eval-sets/test.json  # Suspends
uipath eval agent-simple --resume  # Agent re-suspends, evaluators still skipped

After (evaluators run and produce scores):

rm -rf __uipath/state.db
uipath eval agent-simple evaluations/eval-sets/test.json  # Suspends
uipath eval agent-simple --resume  # Agent completes, evaluators run ✓

Related PRs

  • feat: pack uv.lock #414 (uipath-langchain-python) - Sample demonstrating suspend/resume
  • TBD (uipath-agents-python) - Integration testing

Architecture

SUSPEND PHASE (Resume mode: False)
  eval_item.id → runtime_id → thread_id
  Agent suspends at interrupt()
  Checkpoint saved with thread_id=eval_item.id
  Evaluators skipped ✓

RESUME PHASE (Resume mode: True) 
  eval_item.id → runtime_id → thread_id (same as suspend!)
  Checkpoint found using thread_id
  Command(resume=data) passed to the graph; interrupt() returns data
  Agent completes execution
  Evaluators run on final output ✓

Notes

  • Mock resume data is used for testing; production orchestrator provides actual data
  • Backward compatible: non-suspend scenarios unaffected
  • Thread ID consistency maintained per eval_item across suspend/resume cycles

Chibi Vikram added 5 commits January 15, 2026 07:38
Adds support for suspending and resuming evaluations that invoke RPA
processes. When an evaluation suspends while waiting for an external
job, it can now be resumed after the job completes.

Changes:
- Added SUSPENDED status detection after agent execution
- Added --resume flag to 'uipath eval' command
- Skip evaluator execution for suspended runs (evaluators run on
  resume)
- Pass triggers through evaluation flow to enable resume
- Added comprehensive logging for suspend/resume debugging

Testing done with tool-calling-suspend-resume sample in
uipath-langchain-python PR #414.

This is a critical fix for serverless executor integration.

Problem:
- Inner runtime (agent) returns SUSPENDED status when interrupt() is called
- Evaluation runtime was hardcoding SUCCESSFUL status in the result
- Serverless executor sees SUCCESSFUL and doesn't suspend the job
- State is not saved, resume cannot work

Solution:
- Check all evaluation run results for SUSPENDED status
- Propagate SUSPENDED to top-level UiPathRuntimeResult
- Also handle FAULTED status propagation (FAULTED > SUCCESSFUL, SUSPENDED > FAULTED)

This ensures the serverless executor:
- Detects SUSPENDED status correctly
- Saves checkpoint to blob storage
- Saves trigger to SQL database
- Suspends the job properly
- Can resume when trigger completes

Addresses feedback from @cristian-pufu in PR review.
The check 'if overall_status != UiPathRuntimeStatus.SUSPENDED' was redundant
because we break immediately when SUSPENDED is found, so overall_status
can never be SUSPENDED at the FAULTED check point.

Simplified logic:
- SUSPENDED: set and break (highest priority)
- FAULTED: set and continue (in case later eval is SUSPENDED)
- SUCCESSFUL: default

This makes the priority explicit: SUSPENDED > FAULTED > SUCCESSFUL
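The simplified priority logic can be sketched like this. The `RuntimeStatus` enum is a local stand-in for `UiPathRuntimeStatus`, and `overall_status` is a hypothetical function name; only the set/break/continue behavior is taken from the commit message.

```python
from enum import Enum
from typing import Iterable

class RuntimeStatus(Enum):
    """Local stand-in for UiPathRuntimeStatus (assumption for this sketch)."""
    SUCCESSFUL = "successful"
    FAULTED = "faulted"
    SUSPENDED = "suspended"

def overall_status(run_statuses: Iterable[RuntimeStatus]) -> RuntimeStatus:
    """Aggregate per-run statuses with priority SUSPENDED > FAULTED > SUCCESSFUL."""
    overall = RuntimeStatus.SUCCESSFUL  # default when nothing suspends or faults
    for status in run_statuses:
        if status is RuntimeStatus.SUSPENDED:
            overall = RuntimeStatus.SUSPENDED
            break  # highest priority, no need to look further
        if status is RuntimeStatus.FAULTED:
            overall = RuntimeStatus.FAULTED  # keep scanning: a later run may be SUSPENDED
    return overall

S, F, P = RuntimeStatus.SUCCESSFUL, RuntimeStatus.FAULTED, RuntimeStatus.SUSPENDED
all_ok = overall_status([S, S])
one_fault = overall_status([F, S])
fault_then_suspend = overall_status([F, P])
suspend_then_fault = overall_status([P, F])
```

Because the loop breaks as soon as SUSPENDED is seen, the FAULTED branch can never overwrite it, which is why the extra guard mentioned in the review was unnecessary.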

Changes in this release:
- Fix: Propagate SUSPENDED status from inner runtime to evaluation result
- Fix: Remove redundant condition in status propagation logic
- Feat: Add --resume flag for eval command
- Feat: Add comprehensive logging for suspend/resume flow
- Docs: Add interrupt/suspend/resume architecture documentation

- Use eval_item.id as runtime_id (thread_id) for consistent checkpointing
  across suspend and resume invocations
- When --resume flag is set, pass Command(resume=data) to continue from
  interrupt() point instead of starting fresh
- Mock resume data for testing; production orchestrator provides actual
  result data from external work (RPA, HITL, etc.)
- This allows evaluators to execute and produce scores after agent
  completes post-resume

Fixes the issue where evaluators were not running in resume mode.
@github-actions bot added the test:uipath-langchain (triggers tests in the uipath-langchain-python repository) and test:uipath-llamaindex (triggers tests in the uipath-llamaindex-python repository) labels on Jan 15, 2026